Recent Advances in Information Systems and Technologies
Author: Álvaro Rocha, Ana Maria Correia, Hojjat Adeli, Luís Paulo Reis & Sandra Costanzo
Language: eng
Format: epub
Publisher: Springer International Publishing, Cham
Keywords
Security, Insider attacks, Big data, Authentication, Hadoop
1 Introduction
The exponential growth of data in every aspect of our lives, and in enterprises across the world, creates a demand to draw value from that data. In 2013, five exabytes of data were created each day in various sizes and formats, from sensors, individual archives, social networks, the IoT (Internet of Things), and companies [1]. One of the most challenging issues is how to manage such a large amount of data effectively and how to identify new ways of analyzing it to extract value. Big data technologies are a step forward in handling this problem. An early version of the big data concept was described in 2001 in the Gartner report by Laney [2], where big data was defined as large and complex data sets that the computing facilities of the time were not able to handle. It is characterized by the 3Vs (Volume, Velocity, and Variety). Some organizations have since added further Vs, such as “Veracity” and “Value” [3], which broadened the characterization of big data. As these systems grow in popularity, their repositories are increasingly likely to hold sensitive data, which, as always, must be properly secured. There is little doubt that new frameworks for analyzing data can provide a robust foundation for a new generation of analytics and insight, but it is important to consider security before launching or expanding a big data platform. The complexity and variety of these systems call for a comprehensive approach to the security of the entire big data system [4]. Hadoop systems are insecure by default, and customers often deploy them quickly without proper controls, which can cause serious errors that lead to an organizational disaster. Such systems are particularly exposed to insider attacks. The aim of this paper is to analyze whether big data system administrators are concerned with the security and privacy of their systems' users.
To this end, we present the results of a survey of big data administrators, with questions that allow us to draw conclusions about the state of security in these systems. Additionally, we provide background on the security of big data platforms, Apache Hadoop in particular, and show what an insider attacker can do with access to a network that hosts a non-secure Hadoop cluster. The remainder of this paper is organized as follows. Section 2 discusses related work on security in big data platforms. Section 3 describes the Apache Hadoop platform and its security model. Section 4 presents some attacks that an insider can perform in a non-secure Hadoop environment. In Sect. 5, we report the results of a survey on which platforms big data administrators work with and whether security is configured appropriately. Section 6 presents the results of benchmark tests performed to evaluate the performance impact of enabling encryption; these results help us understand the cost of such security measures. Finally, Sect. 7 concludes the paper and proposes future work.
